16 research outputs found

    Publishing Microdata with a Robust Privacy Guarantee

    Full text link
    Today, the publication of microdata poses a privacy threat. Vast research has striven to define the privacy condition that microdata should satisfy before it is released, and devise algorithms to anonymize the data so as to achieve this condition. Yet, no method proposed to date explicitly bounds the percentage of information an adversary gains after seeing the published data for each sensitive value therein. This paper introduces beta-likeness, an appropriately robust privacy model for microdata anonymization, along with two anonymization schemes designed therefor, the one based on generalization, and the other based on perturbation. Our model postulates that an adversary's confidence on the likelihood of a certain sensitive-attribute (SA) value should not increase, in relative difference terms, by more than a predefined threshold. Our techniques aim to satisfy a given beta threshold with little information loss. We experimentally demonstrate that (i) our model provides an effective privacy guarantee in a way that predecessor models cannot, (ii) our generalization scheme is more effective and efficient in its task than methods adapting algorithms for the k-anonymity model, and (iii) our perturbation method outperforms a baseline approach. Moreover, we discuss in detail the resistance of our model and methods to attacks proposed in previous research.Comment: VLDB201

    Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine

    Get PDF
    Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled in the sense that they all need to determine the allocations of workloads and migrate computational states at runtime. Optimizing them independently would result in suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs where workload allocations have little effect on the communication cost, and model the problem of load balance as a Mixed-Integer Linear Program. Afterwards, we present an extended solution called ALBIC, which support general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. The extensive experimental results over both synthetic and real datasets show that our techniques clearly outperform existing approaches

    Privacy-Preserving Data Publication for Static and Streaming Data

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Discrete Particle Swarm Optimization Routing Protocol for Wireless Sensor Networks with Multiple Mobile Sinks

    No full text
    Mobile sinks can achieve load-balancing and energy-consumption balancing across the wireless sensor networks (WSNs). However, the frequent change of the paths between source nodes and the sinks caused by sink mobility introduces significant overhead in terms of energy and packet delays. To enhance network performance of WSNs with mobile sinks (MWSNs), we present an efficient routing strategy, which is formulated as an optimization problem and employs the particle swarm optimization algorithm (PSO) to build the optimal routing paths. However, the conventional PSO is insufficient to solve discrete routing optimization problems. Therefore, a novel greedy discrete particle swarm optimization with memory (GMDPSO) is put forward to address this problem. In the GMDPSO, particle’s position and velocity of traditional PSO are redefined under discrete MWSNs scenario. Particle updating rule is also reconsidered based on the subnetwork topology of MWSNs. Besides, by improving the greedy forwarding routing, a greedy search strategy is designed to drive particles to find a better position quickly. Furthermore, searching history is memorized to accelerate convergence. Simulation results demonstrate that our new protocol significantly improves the robustness and adapts to rapid topological changes with multiple mobile sinks, while efficiently reducing the communication overhead and the energy consumption

    Efficient tree pattern queries on encrypted XML documents

    No full text
    Outsourcing XML documents is a challenging task, because it encrypts the documents, while still requiring efficient query processing. Past approaches on this topic either leak structural information or fail to support searching that has constraints on XML node content. In addition, they adopt a filtering-and-refining framework, which requires the users to prune false positives from the query results. To address these problems, we present a solution for efficient evaluation of tree pattern queries (TPQs) on encrypted XML documents. We create a domain hierarchy, such that each XML document can be embedded in it. By assigning each node in the hierarchy a position, we create for each document a vector, which encodes both the structural and textual information about the document. Similarly, a vector is created also for a TPQ. Then, the matching between a TPQ and a document is reduced to calculating the distance between their vectors. For the sake of privacy, such vectors are encrypted before being outsourced. To improve the matching efficiency, we use a k-d tree to partition the vectors into non-overlapping subsets, such that non-matchable documents are pruned as early as possible. The extensive evaluation shows that our solution is efficient and scalable to large dataset
    corecore